Effects of Parsing Errors on Pre-Reordering Performance for Chinese-to-Japanese SMT

Authors

  • Dan Han
  • Pascual Martínez-Gómez
  • Yusuke Miyao
  • Katsuhito Sudoh
  • Masaaki Nagata
Abstract

Linguistically motivated reordering methods have been developed to improve word alignment, especially for Statistical Machine Translation (SMT) on long-distance language pairs. However, because these methods rely heavily on parsing accuracy, it is useful to explore the relationship between parsing and reordering. For Chinese-to-Japanese SMT, we carry out a three-stage incremental comparative analysis that combines empirical and descriptive approaches to observe the effects of different parsing errors on reordering performance. In the empirical approach, we quantify the distribution of general parsing errors alongside reordering quality; in the descriptive approach, we extract seven influential error patterns and examine their correlation with reordering errors.

Similar resources

Analyzing the Influence of Parsing Errors on Pre-reordering Performance for SMT

Word alignment for long-distance language pairs is problematic in state-of-the-art phrase-based statistical machine translation. Linguistically motivated reordering models have been widely studied to overcome this challenge. One of the most popular and effective methods is called pre-reordering, where words in sentences from the source language are re-arranged with the objective to resemble the w...

Pre-Reordering for Neural Machine Translation: Helpful or Harmful?

Pre-reordering, a preprocessing step that makes the source-side word order close to that of the target side, has been proven very helpful for statistical machine translation (SMT) in improving translation quality. However, is this also the case in neural machine translation (NMT)? In this paper, we first investigate the impact of pre-reordered source-side data on NMT, and then propose to incorporate featur...

Word Order Does NOT Differ Significantly Between Chinese and Japanese

We propose a pre-reordering approach for Japanese-to-Chinese statistical machine translation (SMT). The approach uses dependency structure and manually designed reordering rules to arrange the morphemes of Japanese sentences into Chinese-like word order before a baseline phrase-based (PB) SMT system is applied. Experimental results on the ASPEC-JC data show that the improvement of the proposed pre-re...

Weblio Pre-reordering Statistical Machine Translation System

This paper describes the details of the Weblio Pre-reordering Statistical Machine Translation (SMT) System, which participated in the English-Japanese translation task of the 1st Workshop on Asian Translation (WAT2014). In this system, we applied the pre-reordering method described in (Zhu et al., 2014), and extended the model to obtain N-best pre-reordering results. We also utilized N-best parse trees sim...

Phrase Reordering Model Integrating Syntactic Knowledge for SMT

The reordering model is important for statistical machine translation (SMT). Current phrase-based SMT technologies are good at capturing local reordering but not global reordering. This paper introduces syntactic knowledge to improve the global reordering capability of an SMT system. Syntactic knowledge such as boundary words, POS information and dependencies is used to guide phrase reordering. Not on...

Publication date: 2013